Probabilistic Approach Using Joint Clean and Noisy i-Vectors Modeling for Speaker Recognition
نویسندگان
چکیده
Additive noise is one of the main challenges for automatic speaker recognition and several compensation techniques have been proposed to deal with this problem. In this paper, we present a new ”data-driven” denoising technique operating in the i-vector space based on a joint modeling of clean and noisy i-vectors. The joint distribution is estimated using a large set of i-vectors pairs (clean i-vectors and their noisy versions generated artificially) then integrated in an MMSE estimator in the test phase to compute a ”cleaned-up” version of noisy test ivectors. We show that this algorithm achieves up to 80% of relative improvement in EER. We also present a version of the proposed algorithm that can be used to compensate multiple ”unseen” noises. We test this technique on the recently published SITW database and show a significant gain compared to the baseline system performance.
منابع مشابه
Robust Speaker Recognition Using MAP Estimation of Additive Noise in i-vectors Space
In the last few years, the use of i-vectors along with a generative back-end has become the new standard in speaker recognition. An i-vector is a compact representation of a speaker utterance extracted from a low dimensional total variability subspace. Although current speaker recognition systems achieve very good results in clean training and test conditions, the performance degrades considera...
متن کاملThroat Microphone for Speaker Recognition Using AANN
In this paper, we have analyzed the performance of speaker recognition system based on features extracted from the speech recorded using throat microphone in clean and noisy environment. In general, clean speech performs better for speaker recognition system. Speaker recognition in noisy environment, using transducer held at the throat results in a signal that is clean even in noisy. This speak...
متن کاملروشی جدید در بازشناسی مقاوم گفتار مبتنی بر دادگان مفقود با استفاده از شبکه عصبی دوسویه
Performance of speech recognition systems is greatly reduced when speech corrupted by noise. One common method for robust speech recognition systems is missing feature methods. In this way, the components in time - frequency representation of signal (Spectrogram) that present low signal to noise ratio (SNR), are tagged as missing and deleted then replaced by remained components and statistical ...
متن کاملProbabilistic Approach Using Joint Long and Short Session i-Vectors Modeling to Deal with Short Utterances for Speaker Recognition
Speaker recognition with short utterance is highly challenging. The use of i-vectors in SR systems became a standard in the last years and many algorithms were developed to deal with the short utterances problem. We present in this paper a new technique based on modeling jointly the i-vectors corresponding to short utterances and those of long utterances. The joint distribution is estimated usi...
متن کاملCNN-Based Joint Mapping of Short and Long Utterance i-Vectors for Speaker Verification Using Short Utterances
Text-independent speaker recognition using short utterances is a highly challenging task due to the large variation and content mismatch between short utterances. I-vector and probabilistic linear discriminant analysis (PLDA) based systems have become the standard in speaker verification applications, but they are less effective with short utterances. To address this issue, we propose a novel m...
متن کامل